A Survey of Probabilistic Record Matching Models, Techniques and Tools
Abstract
Probabilistic record linkage concerns the use of stochastic decision models to solve the problem of record linkage (also known as record matching). Data quality has become a key concern in many institutions, and the demand for novel, effective techniques is increasing. Record linkage in general has been studied over the last three decades, and a solid probabilistic decision framework has been proposed, along with several extensions and specific estimation methods. This paper is a survey restricted to the most recent and promising approaches; it also includes a selection of data cleansing tools based on probabilistic decision models.
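The abstract does not name a specific decision model. As an illustration only, the following minimal sketch assumes the classic Fellegi-Sunter formulation, in which each compared field contributes a log-likelihood ratio and the summed weight is checked against two thresholds. The field names, the m/u probabilities, and the thresholds are hypothetical values chosen for the example, not figures from the paper.

```python
import math

# Hypothetical per-field probabilities (illustrative, not taken from the paper):
#   m = P(field agrees | the two records refer to the same entity)
#   u = P(field agrees | the two records refer to different entities)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "city": (0.80, 0.10),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log2 likelihood ratios over fields, assuming conditional independence."""
    total = 0.0
    for field, (m, u) in params.items():
        if rec_a.get(field) == rec_b.get(field):
            total += math.log2(m / u)              # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


def classify(weight, lower=0.0, upper=8.0):
    """Three-way decision; the thresholds are assumed values for this sketch."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link"  # would normally be sent to clerical review


a = {"surname": "rossi", "birth_year": 1970, "city": "rome"}
b = {"surname": "rossi", "birth_year": 1970, "city": "rome"}
c = {"surname": "rossi", "birth_year": 1982, "city": "milan"}
d = {"surname": "bianchi", "birth_year": 1982, "city": "milan"}

print(classify(match_weight(a, b)))  # all three fields agree
print(classify(match_weight(a, c)))  # only the surname agrees
print(classify(match_weight(a, d)))  # no field agrees
```

In practice the m and u probabilities are estimated from the data rather than fixed by hand, and exact equality is usually replaced by an approximate string comparator; both are simplified away here.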
Similar resources
A Comparative Study of the Neural Network, Fuzzy Logic, and Nero-fuzzy Systems in Seismic Reservoir Characterization: An Example from Arab (Surmeh) Reservoir as an Iranian Gas Field, Persian Gulf Basin
Intelligent reservoir characterization using seismic attributes and hydraulic flow units has a vital role in the description of oil and gas traps. The predicted model allows an accurate understanding of the reservoir quality, especially at the un-cored well location. This study was conducted in two major steps. In the first step, the survey compared different intelligent techniques to discover ...
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
Adaptive Approximate Record Matching
Typographical data entry errors and incomplete documents produce imperfect records in real-world databases. These errors generate distinct records that belong to the same entity. The aim of Approximate Record Matching is to find multiple records that belong to one entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...
A Semi-Analytical Method for History Matching and Improving Geological Models of Layered Reservoirs: CGM Analytical Method
History matching is used to constrain flow simulations and reduce uncertainty in forecasts. In this work, we revisited some fundamental engineering tools for predicting waterflooding behavior to better understand the flaws in our simulation and thus find some models which are more accurate with better matching. The Craig-Geffen-Morse (CGM) analytical method was used to predict recovery performa...
Publication date: 2008